Results 1 - 20 of 117
1.
Applied Sciences ; 13(11):6515, 2023.
Article in English | ProQuest Central | ID: covidwho-20244877

ABSTRACT

With the advent of the fourth industrial revolution, data-driven decision making has also become an integral part of decision making. At the same time, deep learning is one of the core technologies of the fourth industrial revolution that have become vital in decision making. However, in the era of epidemics and big data, the volume of data has increased dramatically while the sources have become progressively more complex, making data distribution highly susceptible to change. These situations can easily lead to concept drift, which directly affects the effectiveness of prediction models. How to cope with such complex situations and make timely and accurate decisions from multiple perspectives is a challenging research issue. To address this challenge, we summarize concept drift adaptation methods under the deep learning framework, which is beneficial to help decision makers make better decisions and analyze the causes of concept drift. First, we provide an overall introduction to concept drift, including the definition, causes, types, and process of concept drift adaptation methods under the deep learning framework. Second, we summarize concept drift adaptation methods in terms of discriminative learning, generative learning, hybrid learning, and others. For each aspect, we elaborate on the update modes, detection modes, and adaptation drift types of concept drift adaptation methods. In addition, we briefly describe the characteristics and application fields of deep learning algorithms using concept drift adaptation methods. Finally, we summarize common datasets and evaluation metrics and present future directions.

2.
Decision Making: Applications in Management and Engineering ; 6(1):502-534, 2023.
Article in English | Scopus | ID: covidwho-20244096

ABSTRACT

The COVID-19 pandemic has caused the death of many people around the world and has also caused economic problems for all countries. In the literature, there are many studies that analyze and predict the spread of COVID-19 in cities and countries. However, there is no study that predicts and analyzes the cross-country spread in the world. In this study, a deep learning based hybrid model was developed to predict and analyze the cross-country spread of COVID-19, and a case study was carried out for the Emerging Seven (E7) and Group of Seven (G7) countries. The aim is to reduce the workload of healthcare professionals and to support health planning by predicting the daily number of COVID-19 cases and deaths. The developed model was tested extensively using Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Mean Absolute Error (MAE) and R Squared (R2). The experimental results showed that the developed model was more successful at predicting and analyzing the cross-country spread of COVID-19 in E7 and G7 countries than Linear Regression (LR), Random Forest (RF), Support Vector Machine (SVM), Multilayer Perceptron (MLP), Convolutional Neural Network (CNN), Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM). The developed model has an R2 value close to 0.9 in predicting the number of daily cases and deaths in the majority of E7 and G7 countries. © 2023 by the authors.
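
The four evaluation metrics named in this abstract can be computed as in the following minimal sketch. It uses the standard textbook definitions; the function name `regression_metrics` and the sample numbers are illustrative assumptions and are not taken from the paper.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R^2 for two equal-length sequences.

    Hypothetical helper for illustration; not the authors' code.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)           # residual sum of squares
    mse = ss_res / n
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    # R^2 is undefined when the observed series is constant.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return {"mse": mse, "rmse": math.sqrt(mse), "mae": mae, "r2": r2}


if __name__ == "__main__":
    # Made-up daily case counts, purely to show the call.
    observed = [120, 135, 150, 170]
    predicted = [118, 140, 149, 165]
    print(regression_metrics(observed, predicted))
```

An R2 close to 0.9, as reported above, would mean the residual sum of squares is roughly one tenth of the total sum of squares.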

3.
Journal of Water Resources Planning and Management ; 149(8), 2023.
Article in English | ProQuest Central | ID: covidwho-20242913

ABSTRACT

Water use was impacted significantly by the COVID-19 pandemic. Although previous studies quantitatively investigated the effects of COVID-19 on water use, the relationship between water-use variation and COVID-19 dynamics (i.e., the spatial-temporal characteristics of COVID-19 cases) has received less attention. This study developed a two-step methodology to unravel the impact of COVID-19 pandemic dynamics on water-use variation. First, using a water-use prediction model, the water-use change percentage (WUCP) indicator, which was calculated as the relative difference between modeled and observed water use, i.e., water-use variation, was used to quantify the COVID-19 effects on water use. Second, two indicators, i.e., the number of existing confirmed cases (NECC) and the spatial risk index (SRI), were applied to characterize pandemic dynamics, and the quantitative relationship between WUCP and pandemic dynamics was examined by means of regression analysis. We collected and analyzed 6-year commercial water-use data from smart meters of Zhongshan District in Dalian City, Northeast China. The results indicate that commercial water use decreased significantly, with an average WUCP of 59.4%, 54.4%, and 45.7% during the three pandemic waves, respectively, in Dalian. Regression analysis showed that there was a positive linear relationship between water-use changes (i.e., WUCP) and pandemic dynamics (i.e., NECC and SRI). Both the number of COVID-19 cases and their spatial distribution impacted commercial water use, and the effects were weakened by restriction strategy improvement and the accumulation of experience and knowledge about COVID-19. This study provides an in-depth understanding of the impact of COVID-19 dynamics on commercial water use. The results can be used to help predict water demand during future pandemic periods or other types of natural and human-made disturbance.
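
The WUCP indicator described in this abstract can be sketched as follows. The abstract only states that WUCP is the relative difference between modeled and observed water use; the sign convention (positive means observed use fell below modeled use) and the choice of modeled use as the denominator are assumptions made here for illustration, as is the function name.

```python
def water_use_change_percentage(modeled, observed):
    """Relative difference between modeled and observed water use, in percent.

    Assumptions (not confirmed by the abstract): the modeled value is the
    denominator, and a positive result means observed use is lower than
    the modeled, no-pandemic expectation.
    """
    if modeled <= 0:
        raise ValueError("modeled water use must be positive")
    return (modeled - observed) / modeled * 100.0


if __name__ == "__main__":
    # Made-up volumes: modeled 100 units, observed 40.6 units.
    print(round(water_use_change_percentage(100.0, 40.6), 1))
```

Under these assumptions, a WUCP of 59.4% corresponds to observed use at about 40.6% of the modeled level.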

4.
Epidemic Analytics for Decision Supports in COVID19 Crisis ; : 65-81, 2022.
Article in English | Scopus | ID: covidwho-20237298

ABSTRACT

The spread of the COVID-19 pandemic generated an urgent need for computational systems to model its behavior and support governments and healthcare teams in making proper decisions. There are not many cases of global pandemics in history, and the most recent one has unique characteristics, which are tightly connected to the current society's lifestyle and beliefs, creating an environment of uncertainty. Because of that, the development of mathematical/computational models to forecast the pandemic behavior since its beginning, i.e., with a restricted amount of data collected, is necessary. This chapter focuses on the analysis of different data mining techniques to allow the pandemic prediction with a small amount of data. A case study is presented considering the data from Wuhan, the Chinese city where the virus was first detected, and the place where the major outbreak occurred. The PNN + CF method (Polynomial Neural Network with Corrective Feedback) is presented as the technique with the best prediction performance. This is a promising method that might be considered in eventual future waves of the current pandemic or even to have a suitable model for future epidemic outbreaks around the world. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2022.

5.
Epidemic Analytics for Decision Supports in COVID19 Crisis ; : 17-64, 2022.
Article in English | Scopus | ID: covidwho-20237296

ABSTRACT

A significant number of people infected by COVID19 do not get sick immediately but become carriers of the disease. These patients might have a certain incubation period. However, the classical compartmental model, SEIR, was not originally designed for COVID19. We used the simple, commonly used SEIR model to retrospectively analyse the initial pandemic data from Singapore. Here, the SEIR model was combined with the actual published Singapore pandemic data, and the key parameters were determined by maximizing the nonlinear goodness of fit R2 and minimizing the root mean square error. These parameters served for the fast and directional convergence of the parameters of an improved model. To cover the quarantine and asymptomatic variables, the existing SEIR model was extended to an infectious disease model with a greater number of population compartments, and with parameter values that were tuned adaptively by solving the nonlinear dynamics equations over the available pandemic data, as well as referring to previous experience with SARS. The contribution presented in this paper is a new model called the adaptive SEAIRD model; it considers the new characteristics of COVID19 and is therefore applicable to a population including asymptomatic carriers. The predictive value is enhanced by tuning of the optimal parameters, whose values better reflect the current pandemic. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2022.
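
For orientation, the classical SEIR model that this abstract takes as its starting point can be simulated as in the sketch below. This is the textbook four-compartment model integrated with forward Euler steps, not the adaptive SEAIRD model proposed by the authors; the function name and all parameter values are illustrative assumptions.

```python
def simulate_seir(beta, sigma, gamma, s0, e0, i0, r0, days, dt=0.1):
    """Integrate the classical SEIR equations with forward Euler steps.

    beta:  transmission rate per day
    sigma: 1 / incubation period (E -> I)
    gamma: 1 / infectious period (I -> R)
    Returns a list with one (S, E, I, R) tuple per day, including day 0.
    """
    s, e, i, r = float(s0), float(e0), float(i0), float(r0)
    n = s + e + i + r                      # closed population, no births/deaths
    steps_per_day = int(round(1.0 / dt))
    history = [(s, e, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_exposed = beta * s * i / n * dt
            new_infectious = sigma * e * dt
            new_removed = gamma * i * dt
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_removed
            r += new_removed
        history.append((s, e, i, r))
    return history


if __name__ == "__main__":
    # Hypothetical parameters, chosen only to produce a visible outbreak.
    trajectory = simulate_seir(beta=0.5, sigma=0.2, gamma=0.1,
                               s0=9990, e0=0, i0=10, r0=0, days=60)
    peak_day = max(range(len(trajectory)), key=lambda d: trajectory[d][2])
    print("day of peak infections:", peak_day)
```

Fitting such a model to reported data, as the abstract describes, would mean searching for the `beta`, `sigma` and `gamma` values that maximize R2 or minimize the root mean square error between the simulated and observed curves.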

6.
J Thorac Dis ; 15(6): 2971-2983, 2023 Jun 30.
Article in English | MEDLINE | ID: covidwho-2327718

ABSTRACT

Background: Long-term effects of severe acute respiratory syndrome coronavirus 2 (SARS-COV-2) infection are still under study. The objectives of this study were to identify persistent pulmonary lesions 1 year after coronavirus disease 2019 (COVID-19) hospitalization and assess whether it is possible to estimate the probability that a patient develops these complications in the future. Methods: A prospective study of patients ≥18 years old hospitalized for SARS-COV-2 infection who developed persistent respiratory symptoms, lung function abnormalities or radiological findings 6-8 weeks after hospital discharge. Logistic regression models were used to identify prognostic factors associated with a higher risk of developing respiratory problems. Model performance was assessed in terms of calibration and discrimination. Results: A total of 233 patients [median age 66 years (interquartile range (IQR): 56, 74); 138 (59.2%) male] were categorized into two groups based on whether they stayed in the critical care unit (79 cases) or not (154). At the end of follow-up, 179 patients (76.8%) developed persistent respiratory symptoms, and 22 patients (9.4%) showed radiological fibrotic lesions with pulmonary function abnormalities (post-COVID-19 fibrotic pulmonary lesions). Our prognostic models created to predict persistent respiratory symptoms [post-COVID-19 functional status at initial visit (the higher the score, the higher the risk), and history of bronchial asthma] and post-COVID-19 fibrotic pulmonary lesions [female; FVC% (the higher the FVC%, the lower the probability); and critical care unit stay] one year after infection showed good (AUC 0.857; 95% CI: 0.799-0.915) and excellent performance (AUC 0.901; 95% CI: 0.837-0.964), respectively. Conclusions: The constructed models show good performance in identifying patients at risk of developing lung injury one year after COVID-19-related hospitalization.

7.
Global Media Journal ; 21(62):1-3, 2023.
Article in English | ProQuest Central | ID: covidwho-2323191

ABSTRACT

Keywords: Agenda; Framing; Social representations; Expectations; Computer. Introduction: The development of research projects often requires the competition of computers, software and data analysis techniques, but the acceptance, appropriation and intensive use of them presents limitations in terms of utility and risk expectations [1]. Some explanatory models of human capital formation suggest that the formation of talent or intellectual capital in intangible assets of organizations is due to habitus [3]. [...] the predictive models of the social representations of these determinants have not been observed in the explanation of the relations with the intensive use of technologies, devices and electronic networks. [...] the objective of the present work was to establish the academic link relative to the social representations of computers, considering the dimensions of the organizational, educational and cognitive models. Methodology: A documentary, retrospective and exploratory study was carried out with a selection of sources indexed to international repositories (Table 1), considering the indexing period from 2019 to 2021, as well as the search by allusive keywords for negative (stigma, risk, rejection) and positive (utility, acceptance, appropriation) dimensions. Content analysis and opinion matrices were used, considering the inclusion of findings, ratings and comparisons of coded data such as -1 for negative dimensions (stigma, risk and rejection) and +1 for positive dimensions (utility, acceptance and appropriation). The qualitative data analysis package was used, considering equation (1), in which the contingency relations and the proportions of probabilities of taking risks in permissible thresholds of human capital formation stand out. The contrast of the null hypotheses was made from the estimation of these parameters.

8.
Journal of Advanced Transportation ; 2023, 2023.
Article in English | ProQuest Central | ID: covidwho-2325027

ABSTRACT

This paper presents a new method to quantify the potential user time savings if the urban bus is given preferential treatment, changing from mixed traffic to an exclusive bus lane, using a big data approach. The main advantage of the proposal is the use of the high amount of information that is automatically collected by sensors and management systems in many different situations with a high degree of spatial and temporal detail. These data allow ready adjustment of calculations to the specific reality measured in each case. In this way, we propose a novel methodology of general application to estimate the potential passenger savings instead of using simulation or analytical methods already present in the literature. For that purpose, in the first place, a travel time prediction model per vehicle trip has been developed. It has been calibrated and validated with a historical series of observations in real-world situations. This model is based on multiple linear regression. The estimated bus delay is obtained by comparing the estimated bus travel time with the bus travel time under free-flow conditions. Finally, estimated bus passenger time savings would be obtained if an exclusive bus lane had been implemented. An estimation of the passenger's route in each vehicle trip is considered to avoid average value simplifications in this calculation. A case study is conducted in A Coruña, Spain, to prove the methodology's applicability. The results showed that 18.7% of the analyzed bus trips underwent a delay exceeding 3 min in a 2,448 m long corridor, and more than 33,000 h per year could have been saved with an exclusive bus lane. Understanding the impact of different factors on transit and the benefits of a priority bus system on passengers can help city councils and transit agencies to know which investments to prioritize given their limited budget.

9.
Advances in Multimedia ; 2023, 2023.
Article in English | ProQuest Central | ID: covidwho-2316594

ABSTRACT

There is an "infodemic" of COVID-19 in which many rumours and information disorders are spreading rapidly. The purpose of this study is to build a predictive model for identifying whether COVID-19 information in the Malay language in Malaysia is real or fake. In this study of COVID-19 fake news detection, the synthetic minority oversampling technique (SMOTE) is used to generate synthetic instances of real news in the training set after natural language processing (NLP) and before data modelling, because the number of fake news items is approximately three times greater than that of real news items. Logistic regression, Naïve Bayes, decision trees, support vector machines, random forests, and gradient boosting are employed and compared to determine the most suitable predictive model. In short, the gradient-boosting classifier model has the highest accuracy and F1-score.

10.
International Journal of Information Technology and Decision Making ; 22(3), 2023.
Article in English | ProQuest Central | ID: covidwho-2314833

ABSTRACT

In this research, an effort has been put to develop an integrated predictive modeling framework to automatically estimate the rental price of Airbnb units based on listed descriptions and several accommodation-related utilities. This paper considers approximately 0.2 million listings of Airbnb units across seven European cities, Amsterdam, Barcelona, Brussels, Geneva, Istanbul, London, and Milan, after the COVID-19 pandemic for predictive analysis. RoBERTa, a transfer learning framework in conjunction with K-means-based unsupervised text clustering, was used to form a homogeneous grouping of Airbnb units across the cities. Subsequently, particle swarm optimization (PSO) driven advanced ensemble machine learning frameworks have been utilized for predicting rental prices across the formed clusters of respective cities using 32 offer-related features. Additionally, explainable artificial intelligence (AI), an emerging field of AI, has been utilized to interpret the high-end predictive modeling to infer deeper insights into the nature and direction of influence of explanatory features on rental prices at respective locations. The rental prices of Airbnb units in Geneva and Brussels have appeared to be highly predictable, while the units in London and Milan have been found to be less predictable. Different types of amenity offerings largely explain the variation in rental prices across the cities.

11.
Casopis Za Ekonomiju I Trzisne Komunikacije ; 12(2):364-378, 2022.
Article in English | Web of Science | ID: covidwho-2310033

ABSTRACT

The first models for assessing the insolvency of economic entities were developed more than half a century ago. However, the issue of assessing the creditworthiness of companies is still relevant today, especially in the current business conditions, at the time of the COVID-19 pandemic and geopolitical events in Eastern Europe. In such an extremely dynamic environment, it is very important to anticipate solvency problems in a timely manner, in order to prevent all the negative socio-economic consequences that come with the insolvency of economic entities. This is due to the fact that the company is part of the environment within which it operates and there is interdependence between the company as a legal entity and its environment, and with the disappearance of the company from the market scene there are negative circumstances for various stakeholders. Taking into account the above, the aim of this study is to test the applicability of models for assessing the insolvency of trade companies in the Republic of Srpska in crisis business conditions. The following models were analyzed: Altman's Z-score model, Altman's Z″-score model, Zmijewski model, Fulmer's model, Kralicek's model, BEX index and RAPO model. The following indicators were used as indicators of reliability of the analyzed models: sensitivity, specificity, type I error, type II error and the general efficiency rate of each analyzed model. The research covered 455 companies in the field of trade that are registered on the territory of the Republic of Srpska, and which submitted their financial reports to the Agency for Intermediary, Financial and Information Services of the Republic of Srpska for the period 2020-2021. The research hypothesis, which was tested in the paper, reads: Foreign existing models for assessing insolvency, which do not take into account business conditions in the Republic of Srpska, are not applicable in trade activities in the Republic of Srpska in crisis business conditions.
Based on the obtained results, it can be concluded that the main hypothesis in the paper was confirmed, since the highest rate of general efficiency was recorded by the RAPO model, which was developed based on financial ratios of companies from all economic areas in the Republic of Srpska, which enabled it to take into account the socio-economic business conditions of the Republic of Srpska, as well as the mutual influence of economic branches, which is a very important factor of analysis in the period of crisis.
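
The reliability indicators listed in this abstract can be derived from a confusion matrix, as in the sketch below. The mapping of type I error to "insolvent company classified as solvent" follows a common convention in the insolvency-prediction literature and is an assumption here, since the abstract does not define it; the function name and the counts are hypothetical.

```python
def classifier_indicators(tp, fn, tn, fp):
    """Compute reliability indicators of an insolvency model.

    tp: insolvent companies correctly flagged as insolvent
    fn: insolvent companies wrongly classified as solvent
    tn: solvent companies correctly classified as solvent
    fp: solvent companies wrongly flagged as insolvent
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must contain at least one company")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        # Assumed convention: type I = missed insolvency, type II = false alarm.
        "type_i_error": fn / (tp + fn),
        "type_ii_error": fp / (tn + fp),
        "general_efficiency": (tp + tn) / (tp + fn + tn + fp),
    }


if __name__ == "__main__":
    # Made-up counts for illustration only.
    print(classifier_indicators(tp=40, fn=10, tn=90, fp=10))
```

With this convention the two error rates are the complements of sensitivity and specificity, so a model with the highest general efficiency rate can still have a high type I error if insolvent companies are rare.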

12.
1st Serbian International Conference on Applied Artificial Intelligence, SICAAI 2022 ; 659 LNNS:320-331, 2023.
Article in English | Scopus | ID: covidwho-2292163

ABSTRACT

This paper analyses the possibilities of using Machine learning to develop a forecasting model for COVID-19 with a publicly available dataset from the Johns Hopkins University COVID-19 Data Repository and with the addition of a percentage of each variant from the GISAID Variant database. Genetic programming (GP), a symbolic regressor algorithm, is used for the estimation of new confirmed infected cases, hospitalized cases, cases in intensive care units (ICUs), and deceased cases. This metaheuristics method algorithm was used on a dataset for Austria and neighboring countries Czechia, Hungary, Slovenia, and Slovakia. Machine learning was done to create individual models for each country. Variance-based sensitivity analysis was initiated using the obtained mathematical models. This analysis showed us which input variables the output of the obtained models is sensitive to, like in the case of how much each covid variant affects the spread of the virus or the number of deceased cases. Individual short-term models have achieved very high R2 scores, while long-term predictions have achieved lower R2 scores. © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.

13.
Weather and Forecasting ; 38(4):591-609, 2023.
Article in English | ProQuest Central | ID: covidwho-2306472

ABSTRACT

The Prediction of Rainfall Extremes Campaign In the Pacific (PRECIP) aims to improve our understanding of extreme rainfall processes in the East Asian summer monsoon. A convection-permitting ensemble-based data assimilation and forecast system (the PSU WRF-EnKF system) was run in real time in the summers of 2020–21 in advance of the 2022 field campaign, assimilating all-sky infrared (IR) radiances from the geostationary Himawari-8 and GOES-16 satellites, and providing 48-h ensemble forecasts every day for weather briefings and discussions. This is the first time that all-sky IR data assimilation has been performed in a real-time forecast system at a convection-permitting resolution for several seasons. Compared with retrospective forecasts that exclude all-sky IR radiances, rainfall predictions are statistically significantly improved out to at least 4–6 h for the real-time forecasts, which is comparable to the time scale of improvements gained from assimilating observations from the dense ground-based Doppler weather radars. The assimilation of all-sky IR radiances also reduced the forecast errors of large-scale environments and helped to maintain a more reasonable ensemble spread compared with the counterpart experiments that did not assimilate all-sky IR radiances. The results indicate strong potential for improving routine short-term quantitative precipitation forecasts using these high-spatiotemporal-resolution satellite observations in the future. Significance Statement: During the summers of 2020/21, the PSU WRF-EnKF data assimilation and forecast system was run in real time in advance of the 2022 Prediction of Rainfall Extremes Campaign In the Pacific (PRECIP), assimilating all-sky (clear-sky and cloudy) infrared radiances from geostationary satellites into a numerical weather prediction model and providing ensemble forecasts.
This study presents the first-of-its-kind systematic evaluation of the impacts of assimilating all-sky infrared radiances on short-term quantitative precipitation forecasts using multiyear, multiregion, real-time ensemble forecasts. Results suggest that rainfall forecasts are improved out to at least 4–6 h with the assimilation of all-sky infrared radiances, comparable to the influence of assimilating radar observations, with benefits in forecasting large-scale environments and representing atmospheric uncertainties as well.

14.
IEEE Internet of Things Journal ; 10(8):6742-6755, 2023.
Article in English | ProQuest Central | ID: covidwho-2306448

ABSTRACT

In order to control the first wave of the COVID-19 pandemic in 2020, many models have shown effectiveness in predicting the spread of novel coronavirus pneumonia and the effects of different interventions. However, few models can collect large amounts of high-quality real-time data faster under the premise of protecting privacy, while considering the impact of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) variants and the mass vaccination program as a new intervention. Therefore, we developed a mobile intelligent application that can collect a large amount of real-time data while protecting privacy and conducted a feasibility study by defining a new COVID-19 mathematical model, SEMCVRD. By simulating different intervention measures, the prediction model of the mobile intelligent application used in this article simulates the epidemic situation in the U.K. as an example. The findings are as follows: the optimal intervention strategy is to suppress the intervention at [Formula Omitted] (intervention intensity: the average number of contacts per person per day) before the end of March 2021, then gradually release the intervention intensity at a rate of [Formula Omitted], and finally release the intensity to [Formula Omitted] in June 2021. The COVID-19 pandemic will end at the end of June 2021, when the total number of deaths will reach 128772. This strategy will be able to balance the tradeoff between loss of life and economic loss. Compared with the official statistics released by the U.K. government on May 31, 2021, our model predicts accurately: the relative error of the total number of cases is less than 6.9%, and the relative error of the total number of deaths is less than 1%. Furthermore, the model is also suitable for collecting data from countries/regions around the world.

15.
Systems ; 11(4):201, 2023.
Article in English | ProQuest Central | ID: covidwho-2302147

ABSTRACT

Artificial intelligence (AI) technology plays a crucial role in infectious disease outbreak prediction and control. Many human interventions can influence the spread of epidemics, including government responses, quarantine, and economic support. However, most previous AI-based models have failed to consider human interventions when predicting the trend of infectious diseases. This study selected four human intervention factors that may affect COVID-19 transmission, examined their relationship to epidemic cases, and developed a multivariate long short-term memory network model (M-LSTM) incorporating human intervention factors. Firstly, we analyzed the correlations and lagged effects between four human factors and epidemic cases in three representative countries, and found that these four factors typically delayed the epidemic case data by approximately 15 days. On this basis, a multivariate epidemic prediction model (M-LSTM) was developed. The model prediction results show that coupling human intervention factors generally improves model performance, but adding certain intervention factors also results in lower performance. Overall, a multivariate deep learning model with coupled variable correlation and lag outperformed other comparative models, and thus validated its effectiveness in predicting infectious diseases.
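
The lag analysis mentioned in this abstract (intervention factors leading case counts by roughly 15 days) can be illustrated with a simple lagged Pearson correlation, sketched below. This is a generic technique chosen for illustration; the abstract does not say which correlation measure the authors used, and the function names and toy series are assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (NaN if degenerate)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / math.sqrt(vx * vy)


def best_lag(factor, cases, max_lag):
    """Return (lag, r) maximizing |corr(factor[t], cases[t + lag])|.

    A positive lag means the factor leads the case series by `lag` steps.
    """
    if len(factor) != len(cases):
        raise ValueError("series must have the same length")
    best_lag_value, best_r = 0, 0.0
    for lag in range(0, max_lag + 1):
        x = factor[: len(factor) - lag]   # factor values at time t
        y = cases[lag:]                   # case values at time t + lag
        r = pearson(x, y)
        if not math.isnan(r) and abs(r) > abs(best_r):
            best_lag_value, best_r = lag, r
    return best_lag_value, best_r


if __name__ == "__main__":
    # Toy data: the case series repeats the factor series three steps later.
    factor = [0, 0, 1, 0, 0, 2, 0, 0, 0, 1, 0, 0]
    cases = [0, 0, 0] + factor[:-3]
    print(best_lag(factor, cases, max_lag=5))
```

In a multivariate forecasting setup, a lag found this way could be used to shift each intervention series before feeding it to the model, which is one plausible reading of "coupled variable correlation and lag" in the abstract.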

16.
International Journal of Data Mining and Bioinformatics ; 27(1-3):139-170, 2022.
Article in English | ProQuest Central | ID: covidwho-2300618

ABSTRACT

Mobile money has been known to be a successful venture around the world, especially for African countries, due to the many limitations of traditional banks, such as limited operations, expensive transaction costs and a cumbersome account-opening process, to mention but a few. The presence of mobile money has not only allowed the unbanked population to have accounts but has also alleviated poverty for many rural communities. Zambia has seen an increase in mobile money accounts, and COVID-19 has accelerated this increase. Therefore, this paper sought to determine which data mining algorithm best predicts mobile money transaction growth. This paper was quantitative in nature and used as its sample aggregated monthly mobile money data (from Zambian mobile network operators) from 2013 to 2020, collected from the Bank of Zambia and the Zambia Information Communications and Technology Authority. The paper further used the WEKA data mining tool for data analysis, following the Cross-Industrial Standard Process for data mining guidelines. The performance from best to least is K-nearest neighbour, random forest, support vector machines, multilayer perceptron and linear regression. The predictions from data mining techniques can be deployed to predict growth of mobile money and hence be used in financial inclusion policy formulation and other strategies that can further improve service delivery by mobile network operators.

17.
International Journal of Information Engineering and Electronic Business ; 14(1):1, 2021.
Article in English | ProQuest Central | ID: covidwho-2300239

ABSTRACT

In early 2020, the world was shocked by the outbreak of COVID-19. The World Health Organization (WHO) urged people to stay indoors to avoid the risk of infection. Thus, more people started to shop online, significantly increasing the number of e-commerce users. After some time, users noticed that a few irresponsible online retailers misled customers by hiking product prices before and during a sale, then applying huge discounts. Unfortunately, the "discounted" prices were found to be similar to or only slightly lower than standard pricing. This problem occurs because users are unable to monitor product pricing due to time restrictions. This study proposes a Web application named PriceCop to help customers monitor product pricing. PriceCop is a significant application because it offers price prediction features to help users analyse product pricing for the next day; thus, it can help users to plan before making purchases. The price prediction model is developed using the Linear Regression (LR) technique. LR is commonly used to determine outcomes from predictors. Least Squares Support Vector Machine (LSSVM) and Artificial Bee Colony (ABC) are used as a comparison to evaluate the accuracy of the LR technique. LSSVM-ABC was initially proposed for stock market price predictions. The results show the accuracy of pricing prediction using LSSVM-ABC is 84%, while it is 62% when LR is employed. ABC is integrated into SVM to optimize the solution and is responsible for the best solution in every iteration. Even though LSSVM-ABC predicts product pricing more accurately than LR, this technique is best trained using at least a year's worth of product prices, and the data is limited for this purpose. In the future, the dataset can be collected daily and trained for accuracy.

18.
Region ; 10(1):113-132, 2023.
Article in English | Scopus | ID: covidwho-2299526

ABSTRACT

A significant amount of research has been conducted regarding the resilience of regions and the factors that allow them to face challenges, crises, or disasters. The rise of promising sectors like Machine Learning (ML) and Artificial Intelligence (AI) can enhance this research by applying computing power to regional economic, social, and environmental data analysis to find patterns and create prediction models. Through Machine Learning, this research introduces the use of models that can predict the performance of a region in disasters. A case study of the performance of USA counties during the first wave of the COVID-19 pandemic, and the related restrictions applied by the authorities, was used to reveal the obvious or hidden parameters and factors that affected their resilience, in particular their economic response, and other interesting patterns among the involved attributes. This paper aims to contribute to a methodology and to offer useful guidelines on how regional factors can be translated and processed by data and ML/AI tools and techniques. The proposed models were evaluated on their ability to predict the economic performance of each county, in particular the difference in its unemployment rate between March and June of 2020. The prediction is based on several economic, social, and environmental data (up to that point in time), using classifiers like neural networks and decision trees. A comparison of the different models' execution was performed, and the best models were further analyzed and presented. Further execution results that identified patterns and connections between regional data and attributes are also presented. The main results of this research are i) a methodological framework of how regional status can be translated into digital models and ii) related examples of predictive models in a real case. An effort was also made to decode the results in terms of regional science to produce useful and meaningful conclusions; thus, a decision tree is also presented to demonstrate how these models can be interpreted. Finally, the connection between this work and the strong current trend of regional and urban digitalization towards sustainability is established. © 2023 by the authors. Licensee: REGION-The Journal of ERSA.

19.
Applied Sciences ; 13(7):4119, 2023.
Article in English | ProQuest Central | ID: covidwho-2295367

ABSTRACT

Machine Learning (ML) methods have become important for enhancing the performance of decision-support predictive models. However, class imbalance is one of the main challenges in developing ML models, because it may bias the learning process and the model's generalization ability. In this paper, we consider oversampling methods for generating synthetic categorical clinical data, aiming to improve the predictive performance of ML models and the identification of risk factors for cardiovascular diseases (CVDs). We performed a comparative study of several categorical synthetic data generation methods, including Synthetic Minority Oversampling Technique Nominal (SMOTEN), Tabular Variational Autoencoder (TVAE) and Conditional Tabular Generative Adversarial Networks (CTGANs). Then, we assessed the impact of combining oversampling strategies with linear and nonlinear supervised ML methods. Lastly, we conducted a post-hoc model interpretability analysis based on the importance of the risk factors. Experimental results show the potential of GAN-based models for generating high-quality categorical synthetic data, yielding probability mass functions that are very close to those provided by real data, maintaining relevant insights, and contributing to increasing the predictive performance. The GAN-based model combined with a linear classifier outperforms other oversampling techniques, improving the area under the curve by 2%. These results demonstrate the capability of synthetic data to help with both determining risk factors and building models for CVD prediction.

20.
Atmosphere ; 14(2):311, 2023.
Article in English | ProQuest Central | ID: covidwho-2277674

ABSTRACT

In preparation for the Fourth Industrial Revolution (IR 4.0) in Malaysia, the government envisions a path to environmental sustainability and an improvement in air quality. Air quality measurements were initiated in different settings, including urban, suburban, industrial and rural areas, to detect any significant changes in air quality parameters. Due to the dynamic nature of the weather, geographical location and anthropogenic sources, many uncertainties must be considered when dealing with air pollution data. In recent years, the Bayesian approach to fitting statistical models has gained popularity because its alternative modelling strategy accounts for uncertainties in all air quality parameters. Therefore, this study aims to evaluate the performance of Bayesian Model Averaging (BMA) in predicting the next-day PM10 concentration in Peninsular Malaysia. A case study utilized seventeen years' worth of air quality monitoring data from nine (9) monitoring stations located in Peninsular Malaysia, using eight air quality parameters, i.e., PM10, NO2, SO2, CO, O3, temperature, relative humidity and wind speed. The performance of the next-day PM10 prediction was calculated using six model performance evaluators, namely Coefficient of Determination (R2), Index of Agreement (IA), Kling-Gupta efficiency (KGE), Mean Absolute Error (MAE), Root Mean Squared Error (RMSE) and Mean Absolute Percentage Error (MAPE). The BMA models indicate that relative humidity, wind speed and PM10 contributed the most to the prediction model for the majority of stations, with R2 = 0.752 at Pasir Gudang monitoring station, R2 = 0.749 at Larkin monitoring station, R2 = 0.703 at Kota Bharu monitoring station, R2 = 0.696 at Kangar monitoring station and R2 = 0.692 at Jerantut monitoring station. Furthermore, the BMA models demonstrated good prediction performance, with IA ranging from 0.84 to 0.91, R2 ranging from 0.64 to 0.75 and KGE ranging from 0.61 to 0.74 for all monitoring stations. According to the results of the investigation, BMA should be utilised in research and forecasting operations pertaining to environmental issues such as air pollution. From this study, BMA is recommended as one of the prediction tools for forecasting air pollution concentration, especially particulate matter level.
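The evaluators named in this abstract can be computed from paired observed and predicted values. The sketch below uses the commonly cited definitions (R2 as 1 - SSE/SST, Willmott's Index of Agreement, and the 2009 form of the Kling-Gupta efficiency); whether the study used exactly these variants is an assumption, and the function name and sample values are hypothetical.

```python
import math
from statistics import mean, pstdev


def evaluate(obs, pred):
    """Return common forecast-skill metrics for paired observations and predictions."""
    n = len(obs)
    o_bar, p_bar = mean(obs), mean(pred)
    errors = [p - o for o, p in zip(obs, pred)]
    sse = sum(e * e for e in errors)
    sst = sum((o - o_bar) ** 2 for o in obs)

    # Pearson correlation, needed for the Kling-Gupta efficiency.
    cov = sum((o - o_bar) * (p - p_bar) for o, p in zip(obs, pred)) / n
    r = cov / (pstdev(obs) * pstdev(pred))
    alpha = pstdev(pred) / pstdev(obs)  # variability ratio
    beta = p_bar / o_bar                # bias ratio

    # Denominator of Willmott's Index of Agreement ("potential error").
    potential = sum((abs(p - o_bar) + abs(o - o_bar)) ** 2
                    for o, p in zip(obs, pred))

    return {
        "R2": 1 - sse / sst,
        "IA": 1 - sse / potential,
        "KGE": 1 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2),
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": math.sqrt(sse / n),
        # MAPE is undefined when an observation is zero.
        "MAPE": 100 / n * sum(abs(e / o) for e, o in zip(errors, obs)),
    }


# Hypothetical values for illustration, not PM10 data from the study.
observed = [1.0, 2.0, 3.0]
predicted = [2.0, 3.0, 4.0]
scores = evaluate(observed, predicted)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

In this toy case the predictions track the shape of the observations perfectly but carry a constant bias, so the correlation-based part of KGE is ideal while R2 is negative, which shows why several evaluators are reported together.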
